Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.
- Free, publicly-accessible full text available October 1, 2026
- Free, publicly-accessible full text available July 10, 2026
- Free, publicly-accessible full text available July 21, 2026
- Free, publicly-accessible full text available July 20, 2026
- Free, publicly-accessible full text available March 31, 2026
- Free, publicly-accessible full text available March 31, 2026
- Free, publicly-accessible full text available April 24, 2026
- Instruction tuning is critical for adapting large language models (LLMs) to downstream tasks, and recent studies have demonstrated that small amounts of human-curated data can outperform larger datasets, challenging traditional data scaling laws. While LLM-based data quality rating systems offer a cost-effective alternative to human annotation, they often suffer from inaccuracies and biases, even in powerful models like GPT-4. In this work, we introduce DS2, a Diversity-aware Score curation method for Data Selection. By systematically modeling error patterns through a score transition matrix, DS2 corrects LLM-based scores and promotes diversity in the selected data samples. Our approach shows that a curated subset (just 3.3% of the original dataset) outperforms full-scale datasets (300k samples) across various machine-alignment benchmarks, and matches or surpasses human-aligned datasets such as LIMA with the same sample size (1k samples). These findings challenge conventional data scaling assumptions, highlighting that redundant, low-quality samples can degrade performance and reaffirming that "more can be less." Free, publicly-accessible full text available April 24, 2026
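The abstract above describes correcting noisy LLM-assigned quality scores via a score transition matrix. A minimal illustrative sketch of that idea, not the paper's actual implementation: the matrix `T`, the prior, and the function name `correct_score` are all assumptions, with `T[i, j]` read as the probability that the rater outputs score `j` when the true score is `i`, and correction done by a simple Bayes-rule posterior.

```python
import numpy as np

# Toy score-transition matrix (illustrative values, not from the paper):
# T[i, j] = P(LLM rater outputs score j | true quality score is i).
T = np.array([
    [0.80, 0.15, 0.05],  # true score 0
    [0.10, 0.80, 0.10],  # true score 1
    [0.05, 0.15, 0.80],  # true score 2
])

# Assumed prior over true quality scores.
prior = np.array([0.3, 0.4, 0.3])

def correct_score(observed: int) -> int:
    """Return the most likely true score given an observed LLM score,
    via Bayes' rule: posterior ∝ prior * P(observed | true)."""
    posterior = prior * T[:, observed]
    posterior /= posterior.sum()
    return int(posterior.argmax())

print([correct_score(s) for s in (0, 1, 2)])  # → [0, 1, 2]
```

With a near-diagonal transition matrix the observed scores are mostly trusted; as off-diagonal mass grows (a more biased rater), the prior increasingly overrides the raw LLM score.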
- Free, publicly-accessible full text available November 20, 2025
- Free, publicly-accessible full text available December 10, 2025
An official website of the United States government